The Scalable Commutativity Rule: Designing Scalable Software for Multicore Processors
What fundamental opportunities for scalability are latent in interfaces, such as system call APIs? Can scalability opportunities be identified even before any implementation exists, simply by considering interface specifications? To answer these questions this paper introduces the following rule: Whenever interface operations commute, they can be implemented in a way that scales. This rule aids developers in building more scalable software starting from interface design and carrying on through implementation, testing, and evaluation.
To help developers apply the rule, a new tool named Commuter accepts high-level interface models and generates tests of operations that commute and hence could scale. Using these tests, Commuter can evaluate the scalability of an implementation. We apply Commuter to 18 POSIX calls and use the results to guide the implementation of a new research operating system kernel called sv6. Linux scales for 68% of the 13,664 tests generated by Commuter for these calls, and Commuter finds many problems that have been observed to limit application scalability. sv6 scales for 99% of the tests.
Engineering and Applied Science
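As a rough illustration of what it means for interface operations to commute, the sketch below checks a toy descriptor allocator under two allocation rules. Everything in it is an assumption made for illustration: the names alloc_lowest, alloc_any and commutes are hypothetical, the definition of commutativity is simplified to "both orders give every caller the same result and leave the same final state", and none of it reflects how Commuter models interfaces.

```python
from itertools import permutations


def alloc_lowest(state, caller):
    """Toy rule: always return the lowest unused descriptor."""
    fd = 0
    while fd in state:
        fd += 1
    return state | {fd}, fd


def alloc_any(state, caller):
    """Hypothetical relaxed rule: each caller draws from its own range."""
    fd = caller * 1000
    while fd in state:
        fd += 1
    return state | {fd}, fd


def commutes(op, state, callers=(0, 1)):
    """Return True if every ordering of the callers yields the same
    per-caller results and the same final state (simplified definition)."""
    outcomes = set()
    for order in permutations(callers):
        s = frozenset(state)
        results = {}
        for c in order:
            s, results[c] = op(s, c)
        outcomes.add((s, tuple(sorted(results.items()))))
    return len(outcomes) == 1


if __name__ == "__main__":
    print("lowest-descriptor rule commutes:", commutes(alloc_lowest, frozenset()))
    print("per-caller-range rule commutes:", commutes(alloc_any, frozenset()))
```

Under the lowest-descriptor rule, whichever caller runs first receives descriptor 0, so the results depend on order and the two calls do not commute in this model. Under the relaxed rule each caller gets the same descriptor in either order, which is the kind of interface property the rule ties to a scalable implementation.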
A NEaT Design for reliable and scalable network stacks
Operating systems provide a wide range of services, which are crucial for the increasingly high reliability and scalability demands of modern applications. Providing both reliability and scalability at the same time is hard. Commodity OS architectures simply lack the design abstractions to do so for demanding core OS services such as the network stack. For reliability and scalability guarantees, they rely almost exclusively on ensuring a high-quality implementation, rather than a reliable and scalable design. This results in complex error recovery paths and hard-to-maintain synchronization code. We demonstrate that a simple and structured design that strictly adheres to two principles, isolation and partitioning, can yield reliable and scalable network stacks. We present NEaT, a system which partitions the stack across isolated process replicas handling independent requests. Our design principles intelligently partition the state to minimize the impact of failures (offering strong recovery guarantees) and to scale comparably to Linux without exposing the implementation to common pitfalls such as synchronization errors, poor locality, and false sharing.
Scalable Address Spaces Using RCU Balanced Trees
Software developers commonly exploit multicore processors by building multithreaded software in which all threads of an application share a single address space. This shared address space has a cost: kernel virtual memory operations such as handling soft page faults, growing the address space, mapping files, etc. can limit the scalability of these applications. In widely used operating systems, all of these operations are synchronized by a single per-process lock. This paper contributes a new design for increasing the concurrency of kernel operations on a shared address space by exploiting read-copy-update (RCU) so that soft page faults can both run in parallel with operations that mutate the same address space and avoid contending with other page faults on shared cache lines. To enable such parallelism, this paper also introduces an RCU-based binary balanced tree for storing memory mappings. An experimental evaluation using three multithreaded applications shows performance improvements on 80 cores ranging from 1.7× to 3.4× for an implementation of this design in the Linux 2.6.37 kernel. The RCU-based binary tree enables soft page faults to run at a constant cost with an increasing number of cores, suggesting that the design will scale well beyond 80 cores.
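The read-copy-update idea mentioned above can be sketched in a few lines. This is only a loose analogy under stated assumptions, not the paper's design: the class AddressSpace and its methods are hypothetical, a sorted immutable tuple stands in for the balanced tree, Python's garbage collector stands in for RCU grace periods, and overlapping regions are not checked.

```python
import bisect
import threading


class AddressSpace:
    """Toy mapping table: lock-free lookups, copy-and-publish updates."""

    def __init__(self):
        # Immutable snapshot of (start, end, name) tuples sorted by start.
        self._regions = ()
        # Writers serialize among themselves; readers never take this lock.
        self._write_lock = threading.Lock()

    def lookup(self, addr):
        """Reader path: take one reference to the current snapshot and search it."""
        regions = self._regions
        i = bisect.bisect_right(regions, (addr, float("inf"), "")) - 1
        if i >= 0 and regions[i][0] <= addr < regions[i][1]:
            return regions[i][2]
        return None

    def map(self, start, end, name):
        """Writer path: copy the snapshot, update the copy, then publish it."""
        with self._write_lock:
            updated = sorted(self._regions + ((start, end, name),))
            self._regions = tuple(updated)  # single reference swap


if __name__ == "__main__":
    space = AddressSpace()
    space.map(0x1000, 0x2000, "heap")
    space.map(0x4000, 0x5000, "stack")
    print(space.lookup(0x1800), space.lookup(0x3000))
```

A reader that obtained the old snapshot keeps using it safely while a writer publishes a new one, which mirrors how lookups can proceed in parallel with operations that mutate the same address space.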